Accurate and comprehensive sequencing of personal genomes.

Authors

  • Subramanian S Ajay
  • Stephen C J Parker
  • Hatice Ozel Abaan
  • Karin V Fuentes Fajardo
  • Elliott H Margulies
Abstract

As whole-genome sequencing becomes commoditized and we begin to sequence and analyze personal genomes for clinical and diagnostic purposes, it is necessary to understand what constitutes a complete sequencing experiment for determining genotypes and detecting single-nucleotide variants. Here, we show that the current recommendation of ∼30× coverage is not adequate to produce genotype calls across a large fraction of the genome with acceptably low error rates. Our results are based on analyses of a clinical sample sequenced on two related Illumina platforms, GAII(x) and HiSeq 2000, to a very high depth (126×). We used these data to establish genotype-calling filters that dramatically increase accuracy. We also empirically determined how the callable portion of the genome varies as a function of the amount of sequence data used. These results help provide a "sequencing guide" for future whole-genome sequencing decisions and metrics by which coverage statistics should be reported.

Similar resources

TIARA: a database for accurate analysis of multiple personal genomes based on cross-technology

High-throughput genomic technologies have been used to explore personal human genomes for the past few years. Although the integration of technologies is important for high-accuracy detection of personal genomic variations, no databases have been prepared to systematically archive genomes and to facilitate the comparison of personal genomic data sets prepared using a variety of experimental pla...

Systematic analysis and functional annotation of variations in the genome of an Indian individual.

Whole genome sequencing of personal genomes has revealed a large repertoire of genomic variations and has provided a rich template for identification of common and rare variants in genomes in addition to understanding the genetic basis of diseases. The widespread application of personal genome sequencing in clinical settings for predictive and preventive medicine has been limited due to the lac...

HapEdit: an accuracy assessment viewer for haplotype assembly using massively parallel DNA-sequencing technologies

The massively parallel sequencing technologies have recently flourished and dramatically cut the cost to sequence personal human genomes. Haplotype assembly from personal genomes sequenced using the massively parallel sequencing technologies is becoming a cost-effective and promising tool for human disease study. Computational assembly of haplotypes has been proved to be very accurate, but obvi...

Comment on "A Database of Human Immune Receptor Alleles Recovered from Population Sequencing Data".

High-throughput sequencing data from TCRs and Igs can provide valuable insights into the adaptive immune response, but bioinformatics pipelines for analysis of these data are constrained by the availability of accurate and comprehensive repositories of TCR and Ig alleles. We have created an analytical pipeline to recover immune receptor alleles from genome sequencing data. Applying this pipelin...

The personal genome browser: visualizing functions of genetic variants

Advances in high-throughput sequencing technologies have brought us into the individual genome era. Projects such as the 1000 Genomes Project have led the individual genome sequencing to become more and more popular. How to visualize, analyse and annotate individual genomes with knowledge bases to support genome studies and personalized healthcare is still a big challenge. The Personal Genome B...

Journal title:
  • Genome Research

Volume 21, Issue 9

Pages -

Publication date 2011